training data
Country:
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Israel (0.04)
Technology:
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Country:
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
Country:
- Asia > Singapore (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Country:
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California (0.04)
Industry:
- Information Technology > Security & Privacy (1.00)
- Law (0.68)
Country:
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States (0.04)
Country:
- North America > United States > Indiana (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier
In this work, we aim to advance our understanding of neural text degeneration by presenting a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the degeneration issue and the presence of repetitions in training data. Subsequent experiments also demonstrate that selectively dropping out the attention to repetitive words in training data significantly reduces degeneration.
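The idea of dropping attention to repetitive words can be illustrated with a small sketch. This is not the authors' implementation: the n-gram definition of "repetitive", the function names `repeated_positions` and `attention_mask_with_repetition_dropout`, the drop probability `p`, and the choice to always keep the diagonal are all assumptions made for illustration.

```python
import random


def repeated_positions(tokens, n=2):
    """Return positions covered by an n-gram that already occurred earlier.

    Assumption: a token counts as "repetitive" if it belongs to an n-gram
    that has been seen before in the same sequence.
    """
    seen = set()
    repeated = set()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram in seen:
            repeated.update(range(i, i + n))
        seen.add(gram)
    return repeated


def attention_mask_with_repetition_dropout(tokens, p=0.5, n=2, seed=0):
    """Build a causal attention mask that hides some repetitive tokens.

    mask[i][j] is True when query position i may attend to key position j.
    Each repetitive position is dropped as a key with probability p.
    The diagonal is always kept so no query row ends up empty
    (an illustrative design choice, not taken from the paper).
    """
    rng = random.Random(seed)
    repeated = repeated_positions(tokens, n)
    dropped = {j for j in sorted(repeated) if rng.random() < p}
    size = len(tokens)
    return [
        [(j <= i) and (j == i or j not in dropped) for j in range(size)]
        for i in range(size)
    ]


if __name__ == "__main__":
    tokens = ["a", "b", "c", "a", "b"]
    print(sorted(repeated_positions(tokens)))
    for row in attention_mask_with_repetition_dropout(tokens, p=1.0):
        print(["x" if allowed else "." for allowed in row])
```

In a real training setup a mask like this would be applied during training only, with inference left on the ordinary causal mask; that placement is likewise an assumption here, not a detail stated in the abstract.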
Country:
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
Country:
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > France (0.04)
- Asia > Middle East > Jordan (0.04)
Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)